K-D Decision Tree: An Accelerated and Memory Efficient Nearest Neighbor Classifier
Abstract
This paper presents a novel nearest neighbor (NN) classifier. NN classification is a well-studied method for pattern classification with the following properties: (1) it performs maximum-margin classification and achieves an error rate of less than twice the ideal Bayes error; (2) it requires no knowledge of pattern distributions, kernel functions, or base classifiers; and (3) it applies naturally to multiclass classification problems. Its drawbacks include (A) inefficient memory use and (B) slow classification. This paper addresses problems A and B. In most cases, an NN search algorithm such as the k-d tree is employed as the pattern search engine of an NN classifier. However, NN classification does not always require an NN search. Based on this idea, we propose a novel algorithm named the k-d decision tree (KDDT). Since KDDT requires only Voronoi-condensed prototypes, it consumes less memory than naive NN classifiers. A comparative experiment confirmed that KDDT is much faster than an NN-search-based classifier (from 9 to 369 times faster).
Keywords: nearest neighbor classifier, k-d tree, Voronoi condensing, local nearest neighbor search, safe node merging
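Two ideas from the abstract, Voronoi condensing and classifying without a full NN search, can be illustrated with a toy one-dimensional sketch. This is not the authors' KDDT algorithm. It assumes scalar, distinct prototypes, where the Voronoi neighbors of a point are simply its adjacent points in sorted order. The function names `nn_label` and `voronoi_condense_1d` are hypothetical and chosen only for this illustration.

```python
def nn_label(points, labels, q):
    """Brute-force 1-NN: return the label of the prototype closest to q."""
    nearest = min(range(len(points)), key=lambda i: abs(points[i] - q))
    return labels[nearest]


def voronoi_condense_1d(points, labels):
    """Keep only prototypes that have a Voronoi neighbor of another class.

    Assumption: 1-D, distinct points, so the Voronoi neighbors of a point
    are the points immediately before and after it in sorted order.
    """
    order = sorted(range(len(points)), key=lambda i: points[i])
    kept = []
    for pos, i in enumerate(order):
        # Indices of the left and right neighbors, where they exist.
        neighbors = order[max(pos - 1, 0):pos] + order[pos + 1:pos + 2]
        if any(labels[j] != labels[i] for j in neighbors):
            kept.append(i)
    if not kept:
        # Single-class data: one prototype is enough to classify everything.
        kept = order[:1]
    return [points[i] for i in kept], [labels[i] for i in kept]


# Toy data: four prototypes of class "a", three of class "b".
points = [0.0, 1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = list("aaaabbb")
protos, proto_labels = voronoi_condense_1d(points, labels)

# Queries chosen away from the decision boundary to avoid distance ties.
for q in [-5.0, 0.5, 6.4, 6.6, 20.0]:
    full = nn_label(points, labels, q)
    condensed = nn_label(protos, proto_labels, q)
    print(q, full, condensed)
```

In this toy case only the two prototypes adjacent to the class boundary survive, and the condensed classifier returns the same label as the full one for every query shown. With two prototypes left, the decision reduces to comparing the query against their midpoint, which is one simple way to see why labeling a pattern need not involve finding its exact nearest neighbor.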
Similar resources
Lazy Classifiers Using P-trees
Lazy classifiers store all of the training samples and do not build a classifier until a new sample needs to be classified. They differ from eager classifiers, such as decision tree induction, which build a general model (such as a decision tree) before receiving new samples. K-nearest neighbor (KNN) classification is a typical lazy classifier. Given a set of training data, a k-nearest neighbor c...
Improving Accuracy in Intrusion Detection Systems Using Classifier Ensemble and Clustering
With recent advances in technology, the number of network-based services is increasing, and sensitive user information is shared over the Internet. Accordingly, large-scale malicious attacks on computer networks could cause severe disruption to network services, so cybersecurity has become a major concern for networks. An intrusion detection system (IDS) could be cons...
Nearest Neighbor Classification Using The Layered Range Tree
Finding nearest neighbors efficiently is crucial to the design of any nearest neighbor classifier. This paper shows how layered range trees can be used for efficient nearest neighbor classification. The presented algorithm is simple and finds the nearest neighbor in logarithmic time. It performs d log n + k distance measures to find the nearest neighbor, where k is a constant that is much ...
Performance Comparison between Naïve Bayes, Decision Tree and k-Nearest Neighbor in Searching Alternative Design in an Energy Simulation Tool
An energy simulation tool simulates a building's energy use before the building is constructed. Such a tool commonly includes a feature that proposes alternative designs better than the user's design. In this paper, we propose a novel method for searching for alternative designs, namely by using classification. The classifiers we use are Naïve Bayes, Decision Tree, and k-Nearest Neighbor. ...
A Comparison of Nearest Neighbor Search Algorithms for Generic Object Recognition
The nearest neighbor (NN) classifier is well suited for generic object recognition. However, it requires storing the complete training data, and classification time is linear in the amount of data. There are several approaches to improve runtime and/or memory requirements of nearest neighbor methods: Thinning methods select and store only part of the training data for the classifier. Efficient ...
Journal title:
- IEICE Transactions
Volume 93-D, Issue
Pages -
Publication date: 2003